Audio Anti-Spoofing Based on Audio Feature Fusion

Abstract

The rapid development of speech synthesis technology has significantly improved the naturalness and human-likeness of synthetic speech. As the technical barriers for speech synthesis are rapidly lowering, the number of illegal activities such as fraud and extortion is increasing, posing a significant threat to authentication systems such as automatic speaker verification. This paper proposes an end-to-end synthetic speech detection model based on audio feature fusion, in response to constantly evolving synthesis techniques and to improve the accuracy of detecting synthetic speech. The model uses a pre-trained wav2vec2 model to extract features from raw waveforms and utilizes an audio feature fusion module for back-end classification. The audio feature fusion module aims to improve model accuracy by adequately utilizing the features extracted by the front end and fusing information across timeframes and feature dimensions. Data augmentation is also used to enhance the performance and generalization of the model. The model is trained on the training sets of the logical access (LA) dataset of the ASVspoof 2019 Challenge, an international standard, and tested on the logical access (LA) and deep-fake (DF) evaluation datasets of the ASVspoof 2021 Challenge. The equal error rate (EER) on the ASVspoof 2021 LA and DF datasets is 1.18% and 2.62%, respectively, achieving the best results on the DF dataset.
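The equal error rate quoted in the abstract is the operating point at which the false acceptance rate (spoofed speech accepted as bona fide) equals the false rejection rate (bona fide speech rejected). The following is a minimal sketch of how such a value can be approximated from detector scores; it is not the paper's evaluation code. The function name `equal_error_rate`, the score convention (higher means more likely bona fide), and the toy scores are all assumptions made for illustration.

```python
from bisect import bisect_left


def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER, assuming a higher score means 'more likely bona fide'."""
    if not bonafide_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")
    bona = sorted(bonafide_scores)
    spoof = sorted(spoof_scores)

    # Candidate thresholds: every observed score, plus one value above the
    # maximum so the "reject everything" operating point is also considered.
    thresholds = sorted(set(bona) | set(spoof))
    thresholds.append(thresholds[-1] + 1.0)

    best_gap, eer = None, None
    for t in thresholds:
        # False rejection: bona fide trials scoring below the threshold.
        frr = bisect_left(bona, t) / len(bona)
        # False acceptance: spoofed trials scoring at or above the threshold.
        far = (len(spoof) - bisect_left(spoof, t)) / len(spoof)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (hypothetical): one spoofed trial outscores one bona fide trial.
print(equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6]))  # 0.25
```

With a finite set of trials the two error curves rarely cross exactly, so this sketch reports the midpoint of the two rates at the closest threshold; official challenge tooling may interpolate differently.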

Related resources

Audio Feature Selection Based on Rough Set

Keeping audio features is important for audio index. However, in most cases the features number is huge, thus direct processing is time-consuming. Feature selection, as a preprocessing step of data mining, has turned to be very efficient in reducing dimensionality and removing irrelevant data. In this paper, we propose a feature selection algorithm based on Rough Set theory, which could find ou...

Cross-Database Evaluation of Audio-Based Spoofing Detection Systems

Since automatic speaker verification (ASV) systems are highly vulnerable to spoofing attacks, it is important to develop mechanisms that can detect such attacks. To be practical, however, a spoofing attack detection approach should have (i) high accuracy, (ii) be well-generalized for practical attacks, and (iii) be simple and efficient. Several audio-based spoofing detection methods have been p...

Improvement of Information Fusion Based Audio Steganalysis

In the paper we extend an existing information fusion based audio steganalysis approach by three different kinds of evaluations: The first evaluation addresses the so far neglected evaluations on sensor level fusion. Our results show that this fusion removes content dependability while being capable of achieving similar classification rates (especially for the considered global features) if com...

Dominant Feature Vectors Based Audio Similarity Measure

This paper presents an approach to extracting dominant feature vectors from an individual audio clip and then proposes a new similarity measure based on the dominant feature vectors. Instead of using the mean and standard deviation of frame features in most conventional methods, the most salient characteristics of an audio clip are represented in the form of several dominant feature vectors. Th...

Feature-Level Decision Fusion for Audio-Visual Word Prominence Detection

Common fusion techniques in audio-visual speech processing operate on the modality level. I.e. they either combine the features extracted from the two modalities directly or derive a decision for each modality separately and then combine the modalities on the decision level. We investigate the audio-visual processing of linguistic prosody, more precisely the extraction of word prominence. In th...

Journal

Journal title: Algorithms

Year: 2023

ISSN: 1999-4893

DOI: https://doi.org/10.3390/a16070317